1  Introduction

The pervasive influence of VLSI in the computer science community has given research on parallel
computation its second wind. In contrast with the traditional conception of parallel systems, where several
computers are each assigned complicated tasks, VLSI computation, especially of a systolic nature, involves the
simultaneous use of a great number of very simple processors [MC,K].

As commonly referred to, systolic arrays are one- or two-dimensional arrangements of simple cells locally
connected [K,K1,KL,L]¹. The essential features of systolic cells are their simplicity, regularity, and modularity.
Performance-wise, these characteristics are definite assets, as they ensure high levels of pipelining and
multiprocessing, hence providing massive parallelism. They also affect the economics of the approach by
making circuit development more cost-effective. Indeed, with dropping costs of electronic components and
increasing levels of circuit integration, systems designers are facing the prospect of putting hundreds of
thousands of gates on a single chip, which so far constitutes a formidable challenge. Systolic architectures are
one answer to this challenge. Their modularity permits the designer to decompose the system's architecture
into building blocks which can be used repetitively with simple interfaces.

From the origin, the epithet systolic has been reserved for special-purpose devices, such as multipliers,
priority queues, pattern-matchers, etc. With this perspective, systolic arrays were built with wired-in cell
implementations, which was not a handicap as long as the overall reconfigurability of the array, an
essential feature of a systolic architecture, was preserved. Thus the user was essentially given the freedom to
tailor the array to the size of his problem, without having the possibility of modifying the cell definition. If
one wishes, however, to optimize the cell specifications or to allow a more versatile use of the systolic device,
it is essential that the cell behavior be made programmable [D]. By doing so, it becomes possible to
experiment with different systolic implementations of the same scheme without having to build different chips
and be caught in the bottleneck of fabrication turnaround. Also, programming the array allows the user to
make it fulfill not just one function, but a whole range of related tasks. The merit of this approach partly
resides in the combination of versatility and high performance which it affords. It must also be mentioned that it
serves pedagogical purposes by putting systolic design into the hands of the layman, thus making the
conception and use of very high performance devices more accessible.

The purpose of this work is to present a class-related systolic processor based on the approach just
described. This processor is a programmable systolic array aimed at solving a wide class of geometric
problems in a highly unifying manner. This class of problems contains many of the most basic questions of


¹ A general exposition of systolic architectures can be found in [K1].
computational geometry. Among others, we will find dynamic versions of convex hull, inclusion, range and
inverse range search, planar point location, intersection, triangulation, and closest-point problems. Whenever
possible, we will insist on the dynamic aspect of the problem, for it is often where systolic solutions are at their
best. Moreover, many application areas involve problems of an inherently dynamic nature, with
which we must cope. For example, air traffic control necessitates the real-time solution of closest-point
problems on an ever-changing set of points.

After discussing the advantages of systolic architectures in terms of increased adaptability and
cost-effectiveness, we should investigate the gains in performance to expect from a systolic treatment of
computational geometry. To begin with, let us roughly describe our systolic architecture. We consider only
one-dimensional arrays, i.e., arrays with a single string of cells, each connected to its one or two neighbors.
Furthermore, communication with the outside world (typically, a host computer) takes place solely at either
of the end-cells. It results from this configuration that although there may be full parallelism in the arrays, the
number of I/O operations at any time is always bounded by a constant. We do not make this assumption for
the sake of simplicity, but for the sake of realism. Indeed, in most applications, the systolic device will receive
its data from a sequential computer; therefore the assumption we are making is not a choice but an inevitable
reality.

Being now ready to turn our attention to performance considerations, we immediately derive, from the
assumption above, that N pieces of data cannot be processed in fewer than N systolic steps. This may seem
like a serious handicap, when compared to the $O(N^2)$ or $O(N \log N)$ running times typically offered by
sequential geometric algorithms. One may hope at best for a gain of a factor N or log N; however, asymptotic
figures based on big-Oh considerations are not too relevant in the matter. Indeed, the sole performance goal
in our case is to maximize the throughput, i.e., have the systolic array keep up as closely as possible with the
host/device data rate. This data rate is dependent on the pin bandwidth of the chip, or sometimes, in real-time
applications, on the rate at which data is made available to the host by the outside (e.g., radar, sensor). Note
that the new emphasis made here reflects yet another departure from the traditional study of computational
complexity.

It is often the case that a circuit will receive streams of data, each of them pertaining to a different instance
of the problem. In this case, maximizing the throughput is called pipelining, and to measure the adequacy of
the circuit to respond to a stream of requests, we look at its period, a concept introduced in [VU]. Roughly,
the period of a circuit is the minimum delay between two consecutive sets of inputs. Of course, it is highly
desirable that our systolic designs have period O(1), which often involves preventing the occurrence of
clusters or the presence of cells waiting for others in order to complete execution. We will discuss these
issues in detail later on.
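
As a rough illustration of the period, the timing of a one-dimensional array can be sketched as follows. The model is an assumption made for illustration only: each query enters at the left boundary cell, advances one cell per systolic cycle, and a new query may enter every `period` cycles. The function name `completion_times` is hypothetical and does not appear in the paper.

```python
def completion_times(num_cells, num_queries, period=1):
    """Systolic cycle at which each query has visited every cell.

    Assumed model: query q enters the first cell at cycle q * period and
    then spends exactly one cycle in each of the num_cells cells.
    """
    return [q * period + num_cells for q in range(num_queries)]


# With unit period, answers leave the array one cycle apart,
# after an initial latency equal to the length of the array.
print(completion_times(num_cells=4, num_queries=3))  # [4, 5, 6]
```

Under this model the total time for a stream of Q queries is Q * period + N cycles, which is why a period of O(1) is the quantity to aim for rather than the latency of a single query.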


Figure 1: The one-dimensional systolic array, with a boundary cell at each end.

2  The  geometric  systolic  chip 

Most of the systolic arrays which we will describe in this paper have the basic outlook of fig.1. Interaction
with the outside world takes place solely at the end cells, called boundary cells. All of the other cells, called
generic, are alike, and although boundary cells are assigned additional tasks for I/O purposes, they usually
don't differ drastically from the generic cells. Each cell contains a small amount of memory, in the form of a
few registers. We distinguish two kinds of registers:

1. Working registers for either storing data (point, edge, angle, ...) or for providing temporary storage
for the computations.

2. I/O registers for communicating data between adjacent cells.


To avoid dealing with implementation details at this point (we will take up these issues in the appendix), we
may regard I/O registers as being conceptually "located" on the connection wires between the cells. These
registers are protected by gates which can be either open or locked according to the current clock phase. We
assume that the whole systolic array is synchronous, and that each cell operates in lock-step. For simplicity, we
also assume the existence of two clocks $\varphi_1$ and $\varphi_2$ beating in opposition. This allows us to separate input and
output stages easily by requiring that input (resp. output) gates should all be open (resp. locked) at $\varphi_1$ and
vice versa at $\varphi_2$ (fig.2). The lapse of time between two consecutive phases $\varphi_1$ is called a systolic cycle. It is to be
distinguished from the clock cycle internal to each cell, which is likely to be much shorter. Indeed, a systolic
cycle must correspond at least to the number of internal clock cycles necessary for a cell to complete the
execution of its stored program. We should observe that this clocking arrangement is not unique: systolic
arrays with asynchronous cells and/or adjacent cells operating in opposite cycles are perfectly feasible, so the
choice made here serves only explanatory purposes, without loss of generality.
Figure 2: Handling critical paths.


The only bit of notation, used throughout, that needs to be introduced here concerns the representation of
points by capital letters, A, M, X, ..., with $a_i$ denoting the ith coordinate of point A in a Cartesian system of
coordinates.


3  Convex  hull  problems 

Estimating a population parameter in statistics, or simulating chemical reactions, often requires computing
the convex hull of a set of points in a dynamic fashion [S]. In the former case, one wishes to strip away the
convex hull of the set of points to remove the outliers of the sample, then remove the convex hull of the
remainder, and iterate on this process until only $(1-2\alpha)N$ points remain (N and $\alpha$ are respectively the size of
the sample and a chosen trimming factor). This leads to the definition of the depth of a point as the number of
convex hulls that have to be stripped from the sample until the point is removed. For static and dynamic
solutions to convex hull problems on a conventional machine, see [S,P1,LE,J,OV].

To fulfill our purposes, we will devise a systolic structure which supports the following operations².

1. Insert/delete point M.

2.  Find  and  report  all  the  vertices  of  the  convex  hull  in  clockwise  or  counterclockwise  order. 


² Throughout this section, we will assume the dimension of the space to be 2.
3. Determine whether an arbitrary point M lies inside or outside the convex hull.

As usual with dynamic convex hull routines, deletions and insertions proceed in very different ways. To
cope with this problem, we will describe two systolic arrays, CH1 and CH2, supporting the following
operations.

Array CH1

1. Insert/delete point M.

2. Report all vertices of convex hull (in arbitrary order).

3. Determine whether point M lies inside or outside the convex hull.

Array CH2

1. Insert point M.

2. Report all vertices of convex hull in clockwise (or counterclockwise) order.

3. Determine whether point M lies inside or outside the convex hull.

We observe that in order to support the operations listed at the beginning, it suffices to connect CH1 and
CH2 together.

3.1  The array CH1

CH1 consists of N cells, so as to handle up to N points at any given time, each cell storing one point. All
operations (updates and queries) are initiated at the input cell, with the answers emanating from the output
cell (fig.3).


Figure 3: The overall structure of the array CH1 (the host feeds the input cell and reads from the output cell).


Implementing Operation 1 is straightforward. Points to be inserted are pumped into the left cell, and travel
from left to right, stopping at the first vacant cell. A point to be deleted is input in the same way, moving from
left to right until it encounters the cell where its copy is stored, which it then marks as vacant. Note that the
array does not keep track of the order of the vertices around the convex hull. Operation 3 relies on the
following geometric property.

Lemma 1: Let $M_1,\ldots,M_N$ be a list of N points in the order induced by an angular sweep
around a point M. This point lies inside the convex hull of $\{M_1,\ldots,M_N\}$ if and only if no angle of
the form $(MM_i, MM_{i+1})$, with indices taken mod N, exceeds 180 degrees.

Proof: A consequence of the fact that a point lies outside the convex hull iff there exists a line
containing it, with all the points of the convex hull on one side. □
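
Lemma 1 can be checked directly on a conventional machine. The following is a minimal sketch, not the systolic procedure itself: it sorts the points by angle around M and looks at the largest gap between consecutive directions, including the wrap-around gap. The function name `inside_hull_by_sweep` is hypothetical, and the sketch assumes that M does not coincide with any stored point.

```python
import math


def inside_hull_by_sweep(m, points):
    """Lemma 1 test: m is inside the hull iff no angular gap exceeds 180 degrees."""
    # Direction of each point as seen from m, in the order of an angular sweep.
    angles = sorted(math.atan2(py - m[1], px - m[0]) for px, py in points)
    # Gaps between consecutive directions in the sweep.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    # Gap closing the sweep, from the last point back to the first (mod N).
    gaps.append(angles[0] + 2 * math.pi - angles[-1])
    return max(gaps) <= math.pi


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(inside_hull_by_sweep((1, 1), square))  # True
print(inside_hull_by_sweep((3, 1), square))  # False
```

The systolic array avoids the sort by carrying the current widest angle along with the query, as described next.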


Figure 4: Testing inclusion in the convex hull. Case analysis on the signs F(A,M,B), F(B,M,A), F(C,M,A), and
F(C,M,B), with four possible outcomes: T unchanged; T <- (M,C,B); T <- (M,A,C); M is inside the convex hull.

Lemma 1 shows that we simply have to make the query point M travel from left to right, maintaining the
value of the largest angle $(MM_i, MM_j)$ encountered so far. This is done by a trivial case analysis, illustrated in
fig.4. To alleviate the notation, we define F(M,A,B) as the sign of the expression $um_1 + vm_2 + w$, where
$uX + vY + w = 0$ is an equation of the line passing through A and B³. This provides us with an easy
characterization of whether two points M, P lie on the same side of AB, i.e., they do iff F(M,A,B) = F(P,A,B).
For simplicity, we will always assume that no three points are ever collinear⁴. Let T = (M,A,B) be the triplet
of points yielding the largest angle so far. M will travel along with this piece of information, which must be
tested against each new point encountered, then updated before proceeding to another cell. Testing T against
a new point C leads to the operations described in fig.4.
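
The sign function F is the only arithmetic a cell needs for this test. A minimal sketch follows, using the line coefficients given in the footnote (u = a2 - b2, v = b1 - a1, w = a1 b2 - a2 b1); points are assumed to be pairs of numbers, and the helper names `line_coefficients` and `same_side` are hypothetical.

```python
def line_coefficients(a, b):
    """Return (u, v, w) such that u*X + v*Y + w = 0 is the line through a and b."""
    u = a[1] - b[1]
    v = b[0] - a[0]
    w = a[0] * b[1] - a[1] * b[0]
    return u, v, w


def F(m, a, b):
    """Sign (+1, 0 or -1) of u*m1 + v*m2 + w for the line through a and b."""
    u, v, w = line_coefficients(a, b)
    value = u * m[0] + v * m[1] + w
    return (value > 0) - (value < 0)


def same_side(m, p, a, b):
    """True iff m and p lie on the same side of the line through a and b."""
    return F(m, a, b) == F(p, a, b)


A, B = (0, 0), (1, 0)
print(same_side((0, 1), (5, 2), A, B))   # True
print(same_side((0, 1), (0, -1), A, B))  # False
```

With integer coordinates the test is exact; the value 0 only occurs for collinear triples, which the text rules out.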

The handling of Operation 3 should be clear by now, so we can proceed with Operation 2. One solution would
be, in a first stage, to output copies of all the points, then in a second stage, re-input them one after the other,
while executing Operation 3. To achieve the same result in place, we can view the systolic array as a strip of
paper. The idea is then to pick it up at the input cell end and fold it over, pulling the input cell over from left
to right (fig.5).


Figure  5:  The  fold-over  operation. 


To ensure that each cell will indeed look at all the others, we must update both the covering cells moving
right and the covered cells not yet in motion. The updating is of the same nature as in Operation 3. Note that
the left end of the folded strip will move twice as slowly as the input cell. For this reason, no operation on the
systolic array should be initiated within N systolic cycles after the start of Operation 2. This will ensure that no
query will ever propagate to a cell already engaged in a computation for a previous query. To implement this

³ We have $u = a_2 - b_2$, $v = b_1 - a_1$, $w = a_1 b_2 - a_2 b_1$.

⁴ Relaxing this requirement involves adding only a few simple, uninteresting details to the algorithms, so it is legitimate to allow such
simplifications.
fold-over operation, we need essentially two signals: one is the query itself, which follows the right end of the
covering strip. The other follows the other end, and is necessary to signal the cell that at the cycle following
the next it will have to send a copy of itself to the right, thus becoming the current left front of the covering
strip. See Appendix for details.

3.2  The array CH2

This structure supports only insertions, but in return, it provides an ordered description of the convex hull
at any time. Also, since the array stores only the vertices of the convex hull, it can support an arbitrary
number of insertions, as long as this convex hull always keeps a number of vertices on the order of N. To
begin with, let us give the geometric background behind the algorithm. Assume that $M_1,\ldots,M_p$ are the
vertices of a convex p-gon P, given in clockwise order. Let M be an arbitrary point outside P, and let Q denote
the convex hull of $P \cup \{M\}$. Considering the infinite line passing through an edge e of P, it is easy to see that
adding M to the convex hull will cause the disappearance of e if and only if the line lies between M and
P. This motivates the introduction of the function G, defined by the relation:

$G(M,A,B) = (a_2 - b_2)m_1 + (b_1 - a_1)m_2 + a_1 b_2 - a_2 b_1$

Note that F(M,A,B) = sign G(M,A,B). The following result is simply a more formal statement of the remark
above, and we leave out the proof; see the illustration in fig.6. Once again, in the following, we shall assume that
no three points may be collinear.


Figure  6:  Computing  convex  hulls  in  clockwise  order. 
Lemma 2: Let $M_1,\ldots,M_p$ be the vertices of a convex p-gon P, in clockwise order. Let M be an
arbitrary point and Q denote the convex hull of $P \cup \{M\}$.

1. M lies inside P iff $G(M,M_i,M_{i+1}) < 0$ for all i, $1 \le i \le p$ [mod p].

2. $M_iM_{i+1}$ is an edge of Q iff $G(M,M_i,M_{i+1}) < 0$. Also, if M does not lie inside P, it is a vertex
of Q and its adjacent vertices are, in clockwise order, $M_u$ and $M_v$, defined uniquely
by $G(M_{u-1},M_u,M) < 0$, $G(M_{u+1},M_u,M) < 0$, $G(M_{v-1},M,M_v) < 0$, and $G(M_{v+1},M,M_v) < 0$.
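
The two statements of Lemma 2 translate almost literally into code. The sketch below is illustrative only: it assumes the polygon is given as a clockwise list of coordinate pairs in general position, and the names `lies_inside` and `surviving_edges` are hypothetical.

```python
def G(m, a, b):
    """G(M,A,B) = (a2 - b2) m1 + (b1 - a1) m2 + a1 b2 - a2 b1."""
    return (a[1] - b[1]) * m[0] + (b[0] - a[0]) * m[1] + a[0] * b[1] - a[1] * b[0]


def lies_inside(m, polygon):
    """Part 1: m is inside the clockwise polygon iff G < 0 for every edge."""
    p = len(polygon)
    return all(G(m, polygon[i], polygon[(i + 1) % p]) < 0 for i in range(p))


def surviving_edges(m, polygon):
    """Part 2: edges of the polygon that remain edges of the hull after adding m."""
    p = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % p]) for i in range(p)]
    return [(a, b) for a, b in edges if G(m, a, b) < 0]


square = [(0, 0), (0, 2), (2, 2), (2, 0)]  # clockwise traversal
print(lies_inside((1, 1), square))           # True
print(len(surviving_edges((3, 1), square)))  # 3
```

In the second call only the edge from (2,2) to (2,0) faces the new point, so it is the single edge that disappears.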

The array CH2 has the same overall structure as CH1 (fig.3). Instead of a point, each cell now stores an
edge of the convex hull, however, and the left-to-right order in the array corresponds to a clockwise traversal
of the boundary of the convex hull. Operation 1 (inserting point M) causes M to travel from the input cell to
the output cell, computing the function G defined above in order to determine whether M lies inside the
convex hull. If it lies outside, two edges have to be added to the structure, and in general a bunch of
consecutive edges (at least one, anyhow) must be removed. More precisely, assume that $M_uM_{u+1},\ldots,M_{v-1}M_v$
are the consecutive edges of P to be removed. Upon encountering $M_uM_{u+1}$, M must cause the cell currently
visited to substitute $M_uM$ for $M_uM_{u+1}$. All the subsequent cells will delete their contents, until M encounters
the first edge $M_vM_{v+1}$ not to be affected by the insertion of M. At this point, the current cell must hand the
edge $M_vM_{v+1}$ to its right-hand side neighbor, and keep the edge $MM_v$ in store. M has now ceased to cause
changes in the array, and it can terminate its motion. However, there is now one cell in the array with two
edges. To repair this anomaly, we make sure that the cell keeps its additional edge but forwards its former
contents to its right neighbor. This only shifts the anomaly one cell to the right, but iterating on this
process will eventually cause the last non-vacant cell to release an edge to its neighbor, which solves the
problem. This phenomenon is known as rippling, as it mimics the propagation of a wave in water. We should
observe that if the last non-vacant cell has no right neighbor, overflow must be reported. However, the
insertion may have just caused the deletion of a number of edges, in which case reporting overflow is
undesirable. In general, we pose as a requirement that no overflow should be reported if there is any vacant cell
in the array, no matter where. To comply with this rule, we must ensure that vacant cells which have edges on
their right-hand side, i.e., holes, are filled by edges from the right. To do so, it suffices to have each cell
always check whether its left-hand side neighbor is vacant, in which case it must pass its contents to it. As a
result, it appears that, in general, two opposite motions will take place within the array: one, to the right,
corresponds to queries and insertions, while the other, leftwards, is meant to fill the holes just created.
Operation 2 simply involves pumping out all the edges of the array through the input cell, thus preserving the
(counterclockwise) order of the edges. Operation 3 is a simple application of Lemma 2, similar to Operation 1,
yet without altering the state of the array. The query point M travels left-to-right, checking its location with
respect to each edge in turn. If M is always found to lie on the same side of the edge as the interior of the
polygon, inclusion must be reported; otherwise M lies outside the convex hull. See Appendix for details.
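
The net effect of Operation 1 on the stored edge list can be sketched sequentially. This is not a cell-level simulation of rippling and hole filling; it only computes the edge list that the array should hold once those motions have settled. The function name `insert_into_hull` is hypothetical, and the sketch assumes a clockwise edge list in general position (no three collinear points).

```python
def G(m, a, b):
    """G(M,A,B) = (a2 - b2) m1 + (b1 - a1) m2 + a1 b2 - a2 b1."""
    return (a[1] - b[1]) * m[0] + (b[0] - a[0]) * m[1] + a[0] * b[1] - a[1] * b[0]


def insert_into_hull(edges, m):
    """Return the clockwise edge list of the hull after inserting point m."""
    # An edge must be removed iff its supporting line separates m from the polygon.
    removed_flags = [G(m, a, b) > 0 for a, b in edges]
    if not any(removed_flags):
        return list(edges)  # m lies inside: the array is left unchanged
    n = len(edges)
    # First removed edge in clockwise order: its predecessor is kept.
    start = next(i for i in range(n) if removed_flags[i] and not removed_flags[i - 1])
    removed = 0
    while removed_flags[(start + removed) % n]:
        removed += 1
    first_kept = (start + removed) % n
    kept = [edges[(first_kept + k) % n] for k in range(n - removed)]
    tail = edges[start][0]       # vertex where the removed chain begins
    head = edges[first_kept][0]  # vertex where the removed chain ends
    # The two new edges replace the removed chain.
    return kept + [(tail, m), (m, head)]


square = [(0, 0), (0, 2), (2, 2), (2, 0)]
edges = [(square[i], square[(i + 1) % 4]) for i in range(4)]
print(insert_into_hull(edges, (3, 1)))
```

The returned list is a cyclic rotation of the clockwise boundary; in the array itself the same edges would sit in consecutive cells, packed to the left by the hole-filling motion.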
4  Inclusion, Intersection, and Closest-point problems

We next show that many of the most common geometric problems can be solved by means of a simple
unifying scheme. The underlying idea, already used in arithmetic or pattern matching [K,FK,KL], exploits
the inherent suitability of systolic designs to testing each input datum against the contents of each cell in a
pipeline fashion.

More precisely, let $S_1,\ldots,S_N$ be the data stored in the array, and let $x_1, x_2, \ldots$ denote a list of queries in the
order in which they arrive at the input cell: the goal is to compute for each query $x$ the value of $y_N$
defined by a recurrence relation of the form $y_0 = 0$, $y_i = f(x, y_{i-1}, S_i)$ for $i = 1,\ldots,N$.
Figure 7: A systolic scheme for iterative problems.
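
The iterative scheme can be imitated by a lock-step simulation. The sketch below rests on assumptions that are not spelled out in the text: each cell holds one stored item, a query x travels rightwards together with its partial result y, and `step(x, y, s)` stands for whatever update a cell applies (all names here are hypothetical). The array is assumed to contain at least one cell.

```python
def systolic_run(cells, queries, step, y0=0):
    """Pump queries through the array, one new query per systolic cycle."""
    slots = [None] * len(cells)  # slots[i] is the (x, y) pair currently in cell i
    pending = list(queries)
    results = []
    while pending or any(slot is not None for slot in slots):
        # A pair leaving the last cell has been tested against every stored item.
        if slots[-1] is not None:
            results.append(slots[-1][1])
        # All pairs move one cell to the right; the next query enters on the left.
        slots = [None] + slots[:-1]
        if pending:
            slots[0] = (pending.pop(0), y0)
        # Every occupied cell updates the partial result it currently holds.
        slots = [
            None if slot is None else (slot[0], step(slot[0], slot[1], cells[i]))
            for i, slot in enumerate(slots)
        ]
    return results


# Example: count how many stored values do not exceed each query.
count_le = lambda x, y, s: y + (s <= x)
print(systolic_run([1, 5, 9], [0, 5, 10], count_le))  # [0, 2, 3]
```

Since a new query enters at every cycle, the simulated array has unit period as long as `step` takes constant time.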


Figure 7 sketches a systolic solution for this class of problems. As we will see, it is possible, in most cases, to
make the systolic scheme dynamic, that is, capable of handling updates in the array. If no order among the $S_i$'s
is required, a delete (e) operation simply results in marking the cell storing e vacant, while insert (e) causes the
storing of e in the first cell vacant from the left. If, on the other hand, some order is to be preserved among the
$S_i$'s, an insert operation will involve searching for the appropriate (not necessarily vacant) cell, and storing the
new element in it, thus possibly causing the remaining cells to ripple to the right. Symmetrically, deleting an
element will incur the creation of a hole and the start of a leftward motion aimed at filling it, resulting in the
propagation of the hole to the right end of the array.
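
For the unordered case, the update rules amount to a scan from the left. The sketch below models the array as a Python list in which `None` marks a vacant cell; the function names and the choice of raising `OverflowError` when no vacant cell exists are assumptions made for illustration.

```python
VACANT = None  # marker for a vacant cell (an assumption of this sketch)


def insert_item(cells, item):
    """Store item in the first vacant cell from the left; return that cell's index."""
    for i, content in enumerate(cells):
        if content is VACANT:
            cells[i] = item
            return i
    raise OverflowError("no vacant cell in the array")


def delete_item(cells, item):
    """Mark vacant the first cell holding a copy of item; return its index or None."""
    for i, content in enumerate(cells):
        if content == item:
            cells[i] = VACANT
            return i
    return None


cells = [VACANT] * 3
insert_item(cells, (0, 0))
insert_item(cells, (1, 1))
delete_item(cells, (0, 0))
insert_item(cells, (2, 2))
print(cells)  # [(2, 2), (1, 1), None]
```

In the array itself each scan is carried by the travelling item, so successive updates overlap in time instead of running one after the other.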
For a list of application areas where the geometric problems addressed next arise in practice, Shamos'
thesis [S] is the first source to turn to.

4.1  Inclusion  problems 
1)  Point  /  Polygon 

Does point M lie in polygon P?

The polygon is taken to be simple⁵, but no convexity assumptions are made. It is possible to achieve unit
period with the following systolic scheme. The register $S_i$ holds the pair of endpoints of the ith edge of P,
where the list of edges corresponds to a clockwise traversal of the boundary of P. The variables x and y of
fig.7 are respectively the point M and the pair (uv, u'v'), where uv and u'v' are the edges of P, with u, v
(resp. u', v') giving the clockwise direction, such that their intersection with the vertical line L passing
through M forms the smallest segment so far containing M (fig.8). Testing for the inclusion of point M
involves pumping M throughout the array, from left to right, updating the pair of edges in y on the fly.


Figure 8: Testing inclusion.


⁵ A polygon is simple if no pair of non-adjacent edges intersect.
Let $y = (AA', BB')$, with $D_A = L \cap AA'$ and $D_B = L \cap BB'$.
If $D_A$ (resp. $D_B$) is undefined, it can be set to infinity (resp. -infinity), for convenience.

Eventually, the array can output an inclusion message if $y_N$ falls into case b) of fig.8, or a non-inclusion
signal if it falls into case c). This is a simple application of the Jordan Curve Theorem, stating that a closed
curve in the plane divides the plane into two parts: the inside and the outside. Note that the scheme used
above is far from unique, and other tests for inclusion may lead to equally simple systolic structures. For
example, simply counting the number of intersections with the line L above and below M is sufficient, since
these numbers are odd iff M lies inside the polygon.
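
The counting variant is the easier one to write down. In the sketch below, each polygon edge plays the role of one cell's contents and the running count plays the role of y; the point lies inside exactly when the number of edges crossed by the vertical half-line going up from M is odd. The function names are hypothetical, and M is assumed not to lie on the boundary.

```python
def crossings_above(m, polygon):
    """Number of polygon edges crossed by the vertical half-line going up from m."""
    mx, my = m
    count = 0
    n = len(polygon)
    for i in range(n):
        (ax, ay), (bx, by) = polygon[i], polygon[(i + 1) % n]
        # Half-open rule: the edge is counted only if its endpoints lie on
        # different sides of the vertical line L (so ax != bx here).
        if (ax <= mx) != (bx <= mx):
            y_at = ay + (by - ay) * (mx - ax) / (bx - ax)
            if y_at > my:
                count += 1
    return count


def point_in_polygon(m, polygon):
    """Inclusion test for a simple (not necessarily convex) polygon."""
    return crossings_above(m, polygon) % 2 == 1


l_shape = [(0, 0), (0, 4), (2, 4), (2, 2), (4, 2), (4, 0)]
print(point_in_polygon((1, 3), l_shape))  # True
print(point_in_polygon((3, 3), l_shape))  # False
```

Each edge contributes independently of the others, so the per-cell work is constant and the count can be accumulated as M moves through the array.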


2)  Planar  point  location 

Given a planar graph with faces $f_1,\ldots,f_N$ and a point M, determine the face where M lies.

For this problem, several sequential algorithms with an optimal O(log N) query time exist [S,LT,P2], but
for the most part require complicated preprocessing. Instead, we can design a very simple systolic array to
solve this problem with unit period. To do so, we simply represent the graph by placing in the array, next to
each other, clockwise descriptions of the faces. Since in this way each edge is represented exactly twice, and
since the total number of edges of a planar graph does not exceed the number of faces, up to within a constant
factor, no more than a linear number of cells will be required. We can now view the graph as a union of
polygons, represented in the array by consecutive sublists of edges. Locating a query point M comes down to
testing the point for inclusion with respect to each polygon in turn, as previously described, finally concluding
with a report of the name of the unique polygon which contains M.


3)  Range  search 

In one dimension, the problem consists of computing the number of segments containing a query
point, given a set of N collinear segments. In two dimensions, the goal is to report the number of
rectangles containing a query point, given a set of N iso-rectangles (sides parallel to the X-Y axes)
[BW,BO,NP,E,M].

The systolic array will simply store one segment (resp. rectangle) in each cell, so that the query point can
scan the array left-to-right, checking for inclusion in the segment (resp. rectangle) stored in each cell, and
updating the partial count. Note that the problem can be extended to arbitrary polygons instead of only
iso-rectangles.
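
A sequential rendering of this scan is given below as a minimal sketch. Segments are assumed to be pairs (lo, hi) and iso-rectangles 4-tuples (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2; boundaries are treated as part of the object. These representations and the function names are illustrative choices, not taken from the paper.

```python
def count_segments(segments, q):
    """Number of segments (lo, hi) containing the query point q on the line."""
    count = 0
    for lo, hi in segments:  # one loop iteration per cell of the array
        if lo <= q <= hi:
            count += 1
    return count


def count_rectangles(rectangles, q):
    """Number of iso-rectangles (x1, y1, x2, y2) containing the query point q."""
    count = 0
    for x1, y1, x2, y2 in rectangles:
        if x1 <= q[0] <= x2 and y1 <= q[1] <= y2:
            count += 1
    return count


rects = [(0, 0, 2, 2), (1, 1, 3, 3), (5, 5, 6, 6)]
print(count_rectangles(rects, (1.5, 1.5)))  # 2
```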

4)  Inverse  range  search 

Given a set of segments (resp. rectangles), and given a query segment (or a query rectangle), report
the number of segments (resp. rectangles) that intersect the query object [BW,BO,NP,E,M].

Once again, testing pairwise intersection requires constant time, which ensures unit period. The algorithm
is straightforward and needs no further development.
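
For iso-rectangles the constant-time pairwise test is a comparison of intervals on each axis. The sketch below assumes rectangles are 4-tuples (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2, and that touching boundaries count as intersecting; both conventions and the function names are assumptions.

```python
def rectangles_intersect(r, s):
    """True iff the iso-rectangles r and s share at least one point."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]


def count_intersecting(rectangles, query):
    """Number of stored rectangles that intersect the query rectangle."""
    return sum(1 for r in rectangles if rectangles_intersect(r, query))


rects = [(0, 0, 2, 2), (1, 1, 3, 3), (5, 5, 6, 6)]
print(count_intersecting(rects, (2.5, 2.5, 5.5, 5.5)))  # 2
```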

The last two problems arise constantly in graphics [NS], and in design-rule checking for VLSI circuits
[BO,BW]. Often, however, instead of a mere number of intersections, an explicit report of all the intersecting
pairs is desired. To give our systolic arrays this added capability, it is sufficient to add only a few instructions
to the algorithms. One solution is to prescribe that upon encountering an intersection, a query first sends the
intersecting pair forward to the next cell, and only then proceeds in the same direction. Of course, this will cause a
slowdown; therefore, to prevent overtaking by subsequent queries, we require that before moving an object to
the next cell, the algorithm first check the vacancy of that cell. To that end, each cell must keep sending vacant
or occupied signals to its left-hand side neighbor. The scheme is somewhat similar to the traffic management
described for CH1, so we refer to the appendix for details. We should observe that with the actual reporting
of intersecting pairs, the array still yields maximal throughput, since the output flow is always kept at its
maximum. The concept of period, based on input rate, becomes meaningless, however, since a glut due to
intense output activity may cause a slowdown in the input rate.

4.2  Intersection  problems 

For sequential algorithms, see [S,SH,BW,BO,NP].

5)  Intersection  of  polygons 

Given two polygons P, Q, determine whether they intersect.

If we wish to determine only if the boundaries intersect, we may simply store the edges of P in the systolic
array, and have those of Q travel left-to-right, testing each edge encountered for intersection. It is easy to
extend the method and solve the general problem by observing that P and Q intersect if and only if at least
one of the following conditions is satisfied:

1. A vertex of P lies in Q.

2. A vertex of Q lies in P.

3. The boundaries of P and Q intersect.

Thus it suffices to add to each cell two copies of the procedure described for Problem 1): one with respect
to P, the other with respect to Q. Note that each cell must check whether the passing edge is the last edge of
Q, in which case it must tag a yes or no signal to the tail of Q to acknowledge whether either endpoint of the edge
stored in the cell lies inside Q or not. This is straightforward, and details are left to the attention of the reader.
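
The three conditions can be combined into a sequential test, sketched below under the assumption of general position (no three collinear points among the vertices, no vertex on the other polygon's boundary). Polygons are lists of coordinate pairs; the inclusion test is the crossing-parity variant rather than the two-edge scheme of Problem 1), and all function names are hypothetical.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(a, b, c, d):
    """True iff segments ab and cd cross (general position assumed)."""
    return ((orient(a, b, c) > 0) != (orient(a, b, d) > 0)
            and (orient(c, d, a) > 0) != (orient(c, d, b) > 0))


def edges_of(polygon):
    n = len(polygon)
    return [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]


def contains(polygon, m):
    """Crossing-parity inclusion test along the vertical half-line above m."""
    count = 0
    for (ax, ay), (bx, by) in edges_of(polygon):
        if (ax <= m[0]) != (bx <= m[0]):
            if ay + (by - ay) * (m[0] - ax) / (bx - ax) > m[1]:
                count += 1
    return count % 2 == 1


def polygons_intersect(p, q):
    # Condition 3: the boundaries intersect.
    if any(segments_cross(a, b, c, d)
           for a, b in edges_of(p) for c, d in edges_of(q)):
        return True
    # Conditions 1 and 2: when the boundaries are disjoint, one polygon is
    # nested in the other iff any single one of its vertices is.
    return contains(q, p[0]) or contains(p, q[0])


square = [(0, 0), (0, 2), (2, 2), (2, 0)]
print(polygons_intersect(square, [(1, 1), (1, 3), (3, 3), (3, 1)]))  # True
print(polygons_intersect(square, [(5, 5), (5, 6), (6, 6), (6, 5)]))  # False
```

Testing a single vertex per polygon is a shortcut available to the sequential version; the systolic version described above tests the endpoints stored in every cell, which gives the same answer.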

6) Intersection of half-planes

Given N half-planes, compute their intersection.

This  problem  requires  fi(Ntog  N)  time  on  a  conventional  machine  [S.SH.B}.  As  usual,  we  expect  our 
systolic  implementadon  to  yield  maximal  throughput  and  thus  display  an  overall  0(N)  dme  perfonnance. 
Moreover,  as  we  will  see,  it  is  easy  to  provide  the  array  with  the  capability  of  handling  queries  and  updatei 
without  losing  on  the  overall  pcrfbnnance.  This  addition  is  very  similar  to  the  connection  of  CHI  and  CH2 
described  earlier  for  the  solution  of  dynamic  convex  hull  problems.  Actually,  the  similarity  between  the  two 
problems  is  very  deep.  Bk  it  stems  from  the  geometric  duality  which  exists  between  convex  hulls  and 
intersections  of  halFspaces  [BJ’M]. 

Let I be the intersection of the N half-planes H_1,...,H_N. If I is not empty, it is a convex polygon with
possibly one open side, i.e., two edges that are half-lines meeting at infinity. It is possible to represent I
either by a list of the lines supporting the edges of I, in arbitrary order, or, if we wish more information, by a
list L of edges (A,B), as they appear in a clockwise traversal of the boundary. In case of an open polygon I, we
require that the vertex at infinity should appear at the ends of the list. For example, we may have two points
I_1, I_2 in the list

L = {(I_1,A_1), (A_1,A_2), ..., (A_k,I_2)}

with the understanding that the edge I_1A_1 (resp. A_kI_2) is the infinite ray starting at A_1 (resp. A_k) and passing
through I_1 (resp. I_2).

Note, for the sake of completeness, that the intersection I may also be reduced to a single half-plane or an infinite parallel strip.


Similarly to CH1 and CH2, we will design two systolic arrays INT1 and INT2 to support the following
operations:

Array INT1

1. Insert/delete half-plane H.

2. Report all lines on the boundary of I, in arbitrary order.

3. Determine whether point M lies in I.


Array INT2

1. Insert half-plane H.

2. Report all vertices of I in clockwise (or counterclockwise) order.

3. Determine whether point M lies in I.


Because of the similarity with CH1 and CH2, we only sketch the algorithms. Any standard
representation of half-planes is adequate. For example, (u,v,w,≥) can be used to denote the half-plane

uX + vY + w ≥ 0.

The only point to investigate about INT1 is, in Operation 2, the type of matching involved in the "fold-over"
process. To begin with, it is easy to see that a half-plane H_i contributes an edge to I iff its supporting
line L_i meets the intersection of the N-1 remaining half-planes. Since the intersection of L_i with the
intersection of any subset of {H_1,...,H_N} is, if not empty, a segment, a half-line, or L_i itself, it can be
expressed by means of at most two points, which can then be updated as L_i is matched against
each H_j in turn. All of the other features of INT1 are similar to those of CH1. As for INT2, we assume that,
at all times, the array contains a clockwise description of I, with each edge stored in a separate cell. Once
again, all the operations are handled as in CH2, including the hole-filling process; only the case analysis for
Operation 1, the centerpiece of the algorithm, needs to be detailed, which is done in the appendix.
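
The "at most two points" bookkeeping used in the fold-over matching can be illustrated as follows. This Python sketch is only one possible reading: it assumes that a half-plane is a triple (u, v, w) meaning uX + vY + w ≥ 0, it represents the two points by parameters along the supporting line (None standing for an end at infinity), and the names clip_supporting_line and contributes_edge are ours.

```python
from fractions import Fraction


def clip_supporting_line(i, half_planes):
    """Clip the supporting line of half_planes[i] against all the others.

    Returns None if nothing is left, else a pair (lo, hi) of parameters
    along the line; None in either slot means "unbounded on that side".
    """
    u, v, w = (Fraction(x) for x in half_planes[i])
    n2 = u * u + v * v
    p0 = (-u * w / n2, -v * w / n2)   # a point on the line uX + vY + w = 0
    d = (-v, u)                       # a direction vector of the line
    lo, hi = None, None
    for j, (a, b, c) in enumerate(half_planes):
        if j == i:
            continue
        alpha = a * d[0] + b * d[1]           # change of aX + bY + c per unit t
        beta = a * p0[0] + b * p0[1] + c      # value of aX + bY + c at t = 0
        if alpha == 0:
            if beta < 0:
                return None                   # line entirely outside H_j
            continue
        t = -beta / alpha
        if alpha > 0:
            lo = t if lo is None else max(lo, t)
        else:
            hi = t if hi is None else min(hi, t)
        if lo is not None and hi is not None and lo > hi:
            return None
    return lo, hi


def contributes_edge(i, half_planes):
    """True if the clipped line is a segment, a half-line, or the whole line."""
    span = clip_supporting_line(i, half_planes)
    if span is None:
        return False
    lo, hi = span
    return lo is None or hi is None or lo < hi
```

For instance, with the unit square described by X ≥ 0, Y ≥ 0, X ≤ 1, Y ≤ 1 and the redundant half-plane X ≥ -5, the first four half-planes contribute an edge and the last one does not. Exact rationals are used so that the comparisons do not depend on rounding.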

4.3 Closest-point problems

7) Nearest-neighbor

Given N points and a query point M, determine the nearest neighbor of M; see [S,BSW,BWY].

For this problem, we allow the dimension of the space to be arbitrary and the distance to be based on any
of the L_1, L_2, ..., L_∞ norms. (Recall that the L_p-norm of a vector (x_1,...,x_d) in a Euclidean d-space is
(|x_1|^p + ... + |x_d|^p)^(1/p).) Whereas efficient solutions on a conventional machine involve the use of fancy
data structures (e.g., Voronoi diagrams, planar point location search trees, k-d trees, etc.) entailing substantial
implementation overhead, a simple dynamic systolic scheme can be devised as follows:

Once again, we store one point per cell. Queries travel left-to-right, determining their nearest neighbor on
the fly. To do so, each query is accompanied by the closest point found so far. Updates in the structure
are handled as in CH1, that is, inserting a point into the first available cell encountered, and deleting it by
simply marking the corresponding cell vacant. If desired, a report-all-nearest-neighbors query can be added to
the set of allowed operations. This instruction, which causes the nearest neighbor of each point in the array to
be output, can be implemented by the fold-over procedure of CH1. See Appendix for details.
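
The behavior of a single query can be mimicked sequentially. The Python sketch below is an illustration only: the array is modeled as a list in which None marks a vacant cell, one loop iteration stands for one systolic step, and the function names are ours.

```python
def lp_distance(a, b, p=2):
    """Distance between points a and b under the L_p norm (p may be float('inf'))."""
    if p == float("inf"):
        return max(abs(x - y) for x, y in zip(a, b))
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest_neighbor_pass(cells, query, p=2):
    """Carry the closest point found so far from the leftmost to the rightmost cell."""
    best, best_dist = None, None
    for stored in cells:
        if stored is None:          # vacant cell: the query passes through
            continue
        dist = lp_distance(stored, query, p)
        if best is None or dist < best_dist:
            best, best_dist = stored, dist
    return best


def insert(cells, point):
    """Store the point in the first vacant cell; report overflow if none is left."""
    for k, stored in enumerate(cells):
        if stored is None:
            cells[k] = point
            return True
    return False


def delete(cells, point):
    """Mark the cell holding the point as vacant."""
    for k, stored in enumerate(cells):
        if stored == point:
            cells[k] = None
            return True
    return False
```

In the array itself, successive queries are pipelined, so that a new query can enter while earlier ones are still traveling; the sketch only reproduces the result of each individual query.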

Application areas where a device for reporting near neighbors would be of great interest are many. Air
traffic control is one example: in this situation, typically, a few radars transmit streams of signals giving
updates on the position of nearby airplanes, and minimum safety distances between planes must be
constantly ensured. To speed up the signaling of anomalous positions, an emergency output port can be
reserved on each cell, with a direct link to the host. Although slightly unsystolic, this feature is totally feasible as
long as emergency reports remain rare events.

8) Euclidean minimum spanning tree

Given N points in the plane, construct a tree of minimum total length whose vertices are the given
points [S].

C. Savage, in [SA], proposes a systolic structure for computing the connected components of a graph. This
structure is a one-dimensional systolic array, which can be connected to Leiserson's priority queue [L], so as to
compute the minimum spanning tree in linear time.

9) Triangulation

Partition the convex hull of N points into triangles, using only segments between the points.

This problem, which arises frequently in numerical analysis (finite element method, numerical
interpolations, etc.), has an Ω(N log N) lower bound on a sequential machine [S]. A one-dimensional systolic
scheme can nevertheless achieve linear time, while supporting the following features.

Array TRI

1. Insert a point in the triangulation.

2. Determine in which face of the triangulation a query point lies.

3. Report all the triangles of the triangulation by giving, for each, a clockwise-ordered list of its vertices.


The array TRI computes an arbitrary triangulation, without any consideration of "goodness". Since in
many cases, however, it is crucial that certain quality criteria be met, e.g., minimizing a function of the edges,
the array might be used more advantageously within the framework of a more complicated heuristic. Each
occupied cell may serve one of two purposes: either it stores an edge of the convex hull (R = (A,B)), with A,B
giving the clockwise orientation, or it stores the vertices of a triangle in clockwise order. We also require that,
from left to right, the edges stored in the cells of the first kind should appear in clockwise order (fig. 9). Finally,
we assume the existence of a flag F to signal the first edge of either the upper or the lower chain; see the
description of CH2 in the appendix for more details. With this arrangement, Operation 2 simply involves
testing the query point against each triangle, carrying the containing triangle along with M when detected (if
ever), and otherwise reporting an outside-face message if no such triangle has been found. Simpler yet, Operation
3 involves pumping out the contents of each cell storing a triangle, one by one; see the report operation for CH2
in the appendix.

To  handle  Operation  1,  two  cases  must  be  considered: 

1. M lies inside a triangle (e.g., DFC in fig. 9). We must replace R by, say, MCD, and insert the
triangles MDF and MFC into the next two right neighbors of the current cell. This is done by
rippling to the right (figs. 9, 10, case 1).

2. M lies outside the convex hull, and thus will become a vertex of the new convex hull. The
algorithm is very similar to CH2. Instead of deleting non-convex-hull edges, however, we must
now insert new triangles into the array. Referring to fig. 13, with AB being the edge currently
examined, and C,A,B occurring in clockwise order around the convex hull, we can give the new
case analysis. See the example in fig. 10, case 2.

1) Delete AB, add AM and MBA.

2) No action.

3) Delete AB, add MBA.

4) Add MA.

Remark: to be read after the technical part for CH2 given in the appendix. Note that, instead of one possible
add in CH2 in the course of an insertion, we may now have a total of 3 add operations. Thus, to avoid having
requests overtaking one another, we should add a delay of two more systolic cycles between successive
requests, as compared to CH2. In consequence, a delay of 9 idle cycles between requests is certainly a safe
scheduling. This margin of safety is actually overly conservative, and there is ample room for optimization.

5 Conclusions


The purpose of this work has been to present systolic designs for several geometric problems. Most of the
algorithms described in this paper involve two distinct types of tasks. One is concerned with the actual
computation of geometric functions, and is, in general, the easier to understand. The other involves initiating
and granting requests, which entails moving data around, i.e., adding new items into the array or filling holes
created by deletions. In general, the flow of data is irregular and not predetermined, since it is
content-dependent. With the exception of priority queues and similar structures [CUL], this constitutes a
major departure from most systolic arrays described in the literature, especially those for arithmetic
computations [KL,K,FK]. By contrast, most of the known systolic structures have a fixed, predetermined data
flow, usually highly regular. One major difficulty with random motion is the absence of adequate tools for
proving the correctness of the algorithms, and in particular, describing the behavior of the data flow. There
certainly lie promising avenues of research.

In practice, most of the algorithms given here should undergo substantial revising before being
implemented, so as to take into account the opportunities for local optimization granted by the particular
applications for which the device is intended. Also, the current state of VLSI technology certainly imposes
definite constraints which are bound to influence the overall design. For example, the pin/bandwidth
limitation of today's chips, rightly seen by many as the major bottleneck, can be partly overcome by clustering
several cells onto a single chip. Also, one highly desirable feature of a systolic array is that it is
computation-bounded and not I/O-bounded [K1]. This amounts in practice to ensuring that the cells do not spend most of
the time idle, waiting for inputs to come. As it is, it is doubtful that this could be the case with the algorithms
given here, since executing the microcode alone is most likely to take longer than completing any I/O
operation. At any rate, it is always possible to circumvent this difficulty by providing each cell with a small
random access memory (perhaps ≈ 1-2K with present NMOS technology), and simulating a few tens of cells
sequentially with a single processor. This solution also has the advantage of making the handling of very large
inputs possible without requiring an excessive number of chips, hence partly overcoming the inter-chip
communication bottleneck. This may seem, of course, like an overt denial of the systolic philosophy;
however, the presence of many cells (~ 100) within the array will largely preserve the systolic nature of the
overall structure, as well as its benefits.

At the implementation level, we urge staying away from floating-point representations whenever possible,
because of the inevitable complications which they entail. Note that in all the algorithms given above, only
fixed-point additions, subtractions, and multiplications are needed, with the exception of the intersection
algorithms, which involve the solution of linear equations. In this case, division is needed, yet can be avoided
if rational numbers are kept as pairs of fixed-point numbers, as is common practice in linear programming.
We should also observe that the arithmetic computations involved in the algorithms are in general very simple
and limited, most of them consisting of simple fixed-point inner products.
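
To illustrate how division can be avoided, the following Python sketch (an illustration under our own conventions, not part of the designs above) computes the intersection of two lines uX + vY + w = 0 with integer operations only. The point is kept as numerators xn, yn sharing a common denominator den, and a half-plane (u, v, w), read as uX + vY + w ≥ 0, is tested by sign comparison. The function names are hypothetical.

```python
def line_intersection(l1, l2):
    """Intersection of two lines given as integer triples (u, v, w).

    Returns (xn, yn, den), standing for the point (xn/den, yn/den),
    or None if the lines are parallel. No division is performed.
    """
    u1, v1, w1 = l1
    u2, v2, w2 = l2
    den = u1 * v2 - u2 * v1
    if den == 0:
        return None
    xn = v1 * w2 - v2 * w1
    yn = w1 * u2 - w2 * u1
    return xn, yn, den


def in_half_plane(point, h):
    """True if the rational point (xn/den, yn/den) satisfies uX + vY + w >= 0."""
    xn, yn, den = point
    u, v, w = h
    value = u * xn + v * yn + w * den    # equals den * (uX + vY + w)
    return value * den >= 0              # multiply by den again to fix the sign
```

Every quantity involved is a sum of products of the inputs, i.e., a small fixed-point inner product of the kind mentioned above; the price is that intermediate values need roughly twice as many bits as the inputs.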

Future  work  in  the  area  of  systolic  algorithms  includes,  of  course,  their  actual  implementation  and 
evaluation.  Also,  any  attempt  at  classifying  the  problems  that  lend  themselves  to  systolic  implementations 
appears  very  worthwhile.  Finally,  we  must  once  again  emphasize  the  current  need  for  an  original  description 
language  for  systolic  systems,  as  well  as  new  tools  for  studying  the  behavior  and  proving  the  correctness  of  the 
underlying  algorithms. 

Appendix

I.  The  Algorithm  for  CH1 


1.  The  input  cell 

As illustrated in fig. 11, the input cell has 5 variables attached to it.


Figure 11: The input cell for CH1.

1. Variables y_in and y_out indicate the kind of operation to be performed. y_in and y_out can take on the
values: insert, delete, inclusion (Operation 3), report or repfold (Operation 2). The purpose of this
last distinction is the consistency of the generic cell. Indeed, there must be two kinds of report
signals: one (report) to handle the general case, the other (repfold) to give the additional signal
that the cell receiving it is the left end of the folding strip, and therefore should pass along its own
contents to its right-hand side neighbor at the next systolic cycle.

2. x_in can hold either the coordinates of a point to insert, delete, or test for inclusion, or have an
arbitrary value when y_in = report. x_out serves the same function; however, when y_in = inclusion,
x_out carries both the query point M and the point R, for future setting of the triplet T. For
simplicity, we will represent x_out as (M,R,0). Similarly, when y_in = report, we have x_out ← (R,0,0).

3. R is a register with the coordinates of a point. When this point is deleted from the structure, R is
marked as e to signify that the cell is vacant.

e is a symbol used systematically to denote an arbitrary value without significance to the computation. To
shorten the description of the algorithms, we assume, throughout the paper, that all the output variables
(x_out, y_out, ...) not explicitly set to any value in the algorithm are actually set to e.


The Algorithm


if y_in = insert
    then if R = e
        then "vacancy" R ← x_in
        else y_out ← insert; x_out ← x_in

if y_in = delete
    then if R = x_in
        then R ← e
        else y_out ← delete; x_out ← x_in

if y_in = inclusion
    then y_out ← inclusion; x_out ← (x_in,R,0)

if y_in = report
    then y_out ← repfold; x_out ← (R,0,0)
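
As a reading aid, one systolic step of the input cell can be written as a pure function. This Python sketch rests on our reading of the algorithm above: None stands for the value e, signals are plain strings, and the function name input_cell_step is hypothetical.

```python
E = None  # stands for "e": vacant register or insignificant value


def input_cell_step(R, y_in, x_in):
    """One step of the CH1 input cell; returns the new (R, y_out, x_out)."""
    y_out, x_out = E, E              # outputs not explicitly set default to e
    if y_in == "insert":
        if R is E:
            R = x_in                 # vacancy: keep the point here
        else:
            y_out, x_out = "insert", x_in
    elif y_in == "delete":
        if R == x_in:
            R = E                    # mark the cell vacant
        else:
            y_out, x_out = "delete", x_in
    elif y_in == "inclusion":
        y_out, x_out = "inclusion", (x_in, R, 0)
    elif y_in == "report":
        y_out, x_out = "repfold", (R, 0, 0)
    return R, y_out, x_out
```

For example, inserting a point into a vacant cell stores it and emits nothing, whereas inserting into an occupied cell forwards the request unchanged to the right-hand neighbor.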


2. The generic cell

The generic cell is similar to the input cell, with a few additions. In particular, it requires two more registers
T and C, along with R. As explained above, as points pass over a generic cell in the report mode, the cell
maintains a triplet of points to know its own status with respect to the convex hull of the passing points, hence
the role of register T. T is a pair of points (G,H), so that the triplet is actually (R,G,H). When y_in = report or
repfold, it is clear that y_out should be set to report. However, in the latter case, the cell must know that it must
send its contents at the next systolic cycle. For this reason, y_in = repfold causes the cell to set its one-bit flag C
to 1, in order to remember to do so. Thus, at the next cycle, the cell will send the contents of its register R to
its right-hand side neighbor, along with a repfold signal. Note, however, that only occupied cells do fold over.

When a cell determines that either the point it is currently storing, or the point passing by, lies inside the
convex hull, it sets some appropriate flag to avoid further computation. More precisely, in inclusion, report, or
repfold mode, y_out is set to thruinclusion in the first case and thrureport in the two others, so as to notify
forthcoming cells to abstain from any unnecessary work. Similarly, if this happens with respect to a cell which
has not started moving, T is set to nonconv.

The Algorithm


if y_in = insert
    then if R = e
        then "vacancy" R ← x_in
        else y_out ← insert
             x_out ← x_in

if y_in = delete
    then if R = x_in
        then R ← e
        else y_out ← delete
             x_out ← x_in


then

    Let x_in = (M,A,B)
    if (A,B) = (nonconv,0)
        then "y_in = repfold"
             y_out ← thrureport

    if (y_in ≠ thrureport) ∧ (A ≠ nonconv)

        then

        begin

            Let P = [F(A,M,B) = F(R,M,B)]
                Q = [F(B,M,A) = F(R,M,A)]


            if R = e

                then "empty cell - pass along"
                     y_out ← y_in

                     stop


if y_in = inclusion

    then Let x_in = (M,A,B)

         P = [F(A,M,B) = F(R,M,B)]
         Q = [F(B,M,A) = F(R,M,A)]
    if P ∧ Q
        then


    if ¬P ∧ Q

then 


'out 

out 


inclusion 


y  o— inclusion 

*out^<M^,R) 

lfPA-.Q 

then 


y  *- inclusion 
y  thruinclusion 

*  OUl 


if y_in = thruinclusion

    then

        y_out ← y_in

        x_out ← x_in

if (y_in = report) ∨ (y_in = repfold)
   ∨ (y_in = thrureport)


then 


yout^^po'^ 

‘out^*in 


ten 


yout^^p®^ 

x^-(M.A.R) 


a 

y^*-report 

XoutHM.B.R) 


if T = e

    then

        T ← (M,0)

if T = (G,0)

    then

        T ← (G,M)

if T = (G,H)

    then

        Let V = [F(G,R,H) = F(M,R,H)]
            W = [F(H,R,G) = F(M,R,G)]
        begin

            if ¬V ∧ W

                then T ← (G,M)

            if V ∧ ¬W

                then T ← (H,M)

            if ¬V ∧ ¬W

                then T ← nonconv

        end


if y_in = thrureport

    then

        y_out ← thrureport

        x_out ← (M,0,0)

if y_in = repfold
    then C ← 1


if C = 1


    then "By convention, y_in should be e."
         C ← 0

         if (T = nonconv)


             then


                 T ← (nonconv,0)
         y_out ← repfold
         x_out ← (R,T)


Note that in repfold or report mode, the first two generic cells, and in inclusion mode, the first generic cell,
do not receive a full triplet (M,A,B) as x_in; therefore the first two generic cells do not have to execute the part
of the code that checks local convexity.

3.  The  output  cell 

The output cell is basically a simplified version of the generic cell. In particular, x_out does not need to be a
triplet when the cell is in report or inclusion mode. We still give the algorithm for the sake of completeness.

The  Algorithm 


if y_in = insert
    then

        if R = e
            then R ← x_in

            else y_out ← overflow


if y_in = delete
    then

        if R = x_in
            then

                R ← e

else 


out 


'to 


if y_in = inclusion
    then

        Let x_in = (M,A,B)

        P = [F(A,M,B) = F(R,M,B)]
        Q = [F(B,M,A) = F(R,M,A)]


then  y^^*- outside 

stop 

if-iPA-iQ 

then  y  *— inside 


y  nodeletion 
else y_out ← outside

if  y  =  thruinchaion 
then 

*oui*~*m 


if (y_in = report) ∨ (y_in = repfold)
   ∨ (y_in = thrureport)

then


Let x_in = (M,A,B)
if (A,B) = (nonconv,0)
then


if (y_in ≠ thrureport) ∧ (A ≠ nonconv)

then

begin


Let P = [F(A,M,B) = F(R,M,B)]
    Q = [F(B,M,A) = F(R,M,A)]
if R = e
then 

W/i«ffex 

stop 

if-npA-iQ 

then 


♦-e 


end 


hullvtrtex 

M 


if  y^  =  thrureport 
then 

yout"-* 

if y_in = repfold

    then

        C ← 1

if C = 1 then
    C ← 0

    if (T ≠ nonconv) ∧ (T ≠ e)
        then

            y_out ← hullvertex

else 

yout^* 


if R ≠ e
    then
        if T = e

            then T ← (M,0)
        if T = (G,0)

            then T ← (G,M)
        if T = (G,H)

            then Let V = [F(G,R,H) = F(M,R,H)]
                     W = [F(H,R,G) = F(M,R,G)]
            begin

                if ¬V ∧ W

                    then T ← (G,M)

                if V ∧ ¬W

                    then T ← (H,M)

                if ¬V ∧ ¬W

                    then T ← nonconv


            end




II.  The  Algorithm  for  CH2 


For reasons which will become apparent later on, the frequency of operations initiated on the input cell
must follow the rules below:

1. After starting an operation on the input cell, wait for at least 7 idle cycles before initiating another
request.

2. No operation can be initiated before Operation 2 is completely finished. A special symbol end will
acknowledge that fact.

There is nothing magic about these figures. It is sufficient that a general relation, discussed later on, be
satisfied, and actually, for the sake of simplicity, our rules have been made overly conservative. Because of its
generality, we begin with a description of the generic cell.

1.  The  generic  cell 

As shown in fig. 12, the generic cell can be described with 6 basic variables and 3 registers R, F, C, the first
of which stores one edge (A,B) of the convex hull. Testing the inclusion of a point x_in = M in the convex hull involves
having y_in set to thruinclusion if non-inclusion has already been determined, or inclusion otherwise, in which
case computing G(M,A,B) allows us to iterate on to the next cell. The variables z_in, z_out serve a double
purpose. On the one hand, if z_in is a pair (A,B), the cell is vacant (R = e) and must be filled (R ← z_in). On the
other hand, once a report (Operation 2) has been initiated on the input cell, the contents of each non-vacant
cell will travel towards the input cell to be eventually output. To distinguish between these two kinds of
leftward motion, one bit (report) is tagged to z_in, i.e., (report,A,B), so that the cell knows that it must only
pass this value along (z_out ← (report,A,B)). Of course, if y_in = report, z_out is set to (report,R).


Figure 12: The generic cell for CH2.


The last case to examine (Operation 1) is by far the most delicate. To decide the status of a point M in the
convex hull, Lemma 2 shows that 4 possible situations should be considered. With M traveling along the array from
the input to the output cell, let R = (A,B) be the edge stored at the cell currently visited, and let the variable u
be set to in if the edge (C,A) of the previous (non-empty) cell satisfies G(M,C,A) < 0, or out otherwise.
Similarly, let v be G(M,A,B). Lemma 2 shows that the following actions should be taken.

1. u = in, v > 0 (fig. 13.1). Delete R to replace its contents by (A,M).

2. u = in, v < 0 (fig. 13.2). No action.

3. u = out, v > 0 (fig. 13.3). Delete R.

4. u = out, v < 0 (fig. 13.4). Insert (M,A) before R. Send R to next cell.
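
The four situations amount to a two-way dispatch on u and on the sign of v. The small Python sketch below only restates the table above; the function name and the action strings are ours, and the degenerate case v = 0, which the case analysis does not cover, is rejected explicitly.

```python
def ch2_insert_action(u, v):
    """Action taken by the cell holding R = (A,B) when inserting M.

    u is "in" or "out" (status of M with respect to the previous edge),
    v is the value of G(M,A,B); only its sign matters.
    """
    if u not in ("in", "out"):
        raise ValueError("u must be 'in' or 'out'")
    if v == 0:
        raise ValueError("v = 0 is a degenerate case not handled here")
    if u == "in":
        return "replace R by (A,M)" if v > 0 else "no action"
    return "delete R" if v > 0 else "insert (M,A) before R and send R on"
```

Only Situations 3 and 4 change the number of occupied cells, which is why they are the ones requiring hole-filling and rippling, respectively.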


Figure  13:  Establishing  the  status  of  a  new  point. 


Note that if all 4 cases should arise, they would occur with the order  (up to circular
permutation). Since we wish to pipeline the updates, it is very important that as an insert-M operation travels
left-to-right, the insert signal, at any time, leaves behind the exact clockwise description of the boundary as it
should  be  after  inserting  M.  For  this  reason,  we  must  ensure  that  if  the  4  cases  should  arise,  they  do  so  in  the 
order 


This problem comes from the fact that the variable u cannot be computed for the first cell, since it involves
knowledge of the last occupied cell in the array. To overcome this difficulty, we adopt a slightly different
representation of a convex polygon, which involves partitioning the boundary into two chains of consecutive
edges. One, the upper chain, consists of the upper edges of the polygon, i.e., edges with increasing
X-coordinates in clockwise order; the other, the lower chain, consists of the lower edges, defined as the edges
pointing to the left (fig. 14).


We now require that, from left to right, the array CH2 should store first the edges of the upper chain, then
the edges of the lower chain, both in clockwise order. Of course, we must assume the presence of a flag
register F in each cell, which takes on the value firstup (resp. firstlow) if the cell is currently storing the first
edge of the upper (resp. lower) chain. Otherwise, F is set to e. The flag plays the role of u for the two edges in
the array whose neighbors, in counterclockwise order, are conceptually two infinite vertical rays.


Situations 1 and 2 are straightforward to handle, unlike Situation 3, which creates "holes" in the array, and
Situation 4, which adds one extra edge. In the latter case, the edge R and the flag F will bounce their contents
on to the next cell, which will store them in its registers, and send the former contents of these registers to its
neighbor. This process will iterate until the last cell (R = end) has been reached, thus adding one to the overall
cell occupancy. While a cell is busy sending its contents to its neighbor, it must hold up the insert request to
forward it at the next step. To do so, it uses the third register C. To handle Situation 3, i.e., to fill holes, we
require that at the end of the computation, each cell checks whether it is vacant (R = e), in which case it issues
a hole signal to its right-hand neighbor (y_out ← hole), provided that y_out has not already been set to another
value (e.g., a query/update signal). Upon receiving a hole signal (y_in = hole), the cell must empty its registers
onto its left-hand neighbor (z_out ← (R,F)). One major difficulty is that, with a naive implementation, a
right-moving query/update may miss some left-moving edges. To circumvent this pitfall, we reserve the odd
systolic cycles for all leftward transfers, and the even cycles for the remaining computations.


The  Algorithm 

We  assume  that,  during  even  cycles,  all  I/O  variables  not  explicitly  assigned  to  any  value  are  set  to  e  -  note 
that  allowing  this  to  happen  during  odd  cycles  would  have  disastrous  effects. 

2. The input and output cells

We need not give the details of the algorithms for these cells, since they are merely simplified versions of
the generic cell. Before computation starts, we assume the presence of K in the input cell. For this cell,
the odd cycles will be idle, and except for a special treatment for the first three points entering the array, most
of the behavior of this cell is identical to that of the generic cell. As for the output cell, its most notable feature
is to detect and report possible overflows, as well as outputting an inclusion message if y_in = inclusion, and a
non-inclusion message if y_in = thruinclusion.

3.  Correctness  of  the  algorithm  for  CH2 

To begin with, we should note that along with an insert-M request, two flags (u and w) should be tagged to
M. The variable u is, as shown above, the status in or out of M with respect to the previous cell, and w is a flag
set to firstup (resp. firstlow) if the next edge created, MA, happens to be the first of the upper (resp. lower)
chain. This information is needed when the first edges are deleted by repeated occurrences of Situation 3, and
w is thus the only way to notify the first new edge that it is indeed the first edge of a chain. It is
important to realize that filling holes with a leftward motion of edges is meant only to improve the
performance of the array, i.e., put the limitation on the size of the convex hull rather than on the number of
operations which can be performed. For this reason, we may first show the correctness of the algorithm when
all the instructions related to that hole-filling job are dropped. This involves ignoring odd cycles as well as the
last if-statement of the main algorithm. The only point remaining to be checked is that y_out is always set only
once.

In order to do so, we may start with a few helpful observations. Let us call an even phase the conjunction of
an even followed by an odd cycle. The rules on operation rates specified above impose a delay of at least 4
even phases between two consecutive operations. However, an insert operation may entail the loss of one
phase, caused by the possible (unique) setting of C, thus reducing the above delay to 3. On the contrary, a cell
may issue a hole signal (y_out ← hole) possibly at every even phase, and similarly a cell is in a position to
respond to a hole message at every odd phase (z_out ← (R,F)). From these facts, we derive in particular that
whenever C ≠ e, we also have y_in = e, from which it is easy to see that there is never any conflict in setting y_out.
Now including the hole-filling instructions, we only have to show that there is no conflict in setting the
register R. More precisely, we must prove that whenever z_in = (A,B,F) ≠ e, we have R = e. This comes from the
fact that z_in = (A,B,F) if and only if, at the previous odd cycle, y_out had the value hole. This, in turn, implies
that at the end of the previous even cycle, we had R = e. Since, in addition, R can only be set to e (if it is ever)
at odd cycles, our proof is complete.


The last item to verify is what precisely motivated the distinction between odd and even cycles: the
assurance that all right-moving queries or updates encounter all the edges of the array. If z_in = (A,B,F) ≠ e, the
first action taken by the cell at an even cycle is to set (R,F) to z_in, so that an inclusion or insert operation at that
cycle will effectively deal with the just-left-moved edge. On the other hand, since the edge leaves the cell only
under a y_in = hole situation, the cell will not have to handle any query/update at the next even cycle, so it may
leave the cell without missing any matching, which proves our claim. Our final investigation concerns the
storage efficiency of the array. We have claimed that no overflow will ever occur, as long as the number of
vertices in the convex hull, at any time, does not exceed N/2. We must now support this claim.

The above assumption clearly implies that no more than N/2 cells are occupied (R ≠ e) at any time, since
inserting a vertex involves, first, deleting old edges, then adding the new ones. Trouble may arise, however, if
edges tend to cluster towards the output cell. To dispel that worry, we introduce the concept of leading front,
defined as the rightmost cluster of occupied cells, i.e., the rightmost group of cells without R = e. A leading
front can be characterized by the position H of the first cell, measured as its distance to the input cell, along
with the length L of the front. To prove the absence of leading fronts near the output cell, hence the absence of
overflow, it clearly suffices to establish the following result.

Lemma 3: H + 2L ≤ N

Proof: To look at the evolution of a leading front, suppose that the front (H,L-1) just had one
cell added to it as the result of an insertion, yielding a front (H,L). From the rules, it follows that
during the next 7 cycles, no more cells can be added to the front. However, a hole signal will
necessarily be transmitted to the leftmost cell of the front during the first two even cycles;
therefore this cell will be detached from the front by the second odd cycle, at the latest. For the
same reason, a hole signal will reach the new leftmost cell of the front by the 4th even cycle at the
latest; therefore this cell will also detach itself before the 7 cycles have elapsed, thus leaving a front
(H 2,L) in the worst case, which completes the proof. □
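
As a sanity check on how Lemma 3 rules out overflow, the following Python sketch enumerates every pair (H,L) satisfying H + 2L ≤ N and verifies that the front never reaches the output end of the array. It assumes, for illustration only, that cells are indexed 0 to N-1 by their distance to the input cell and that the first cell of the front is its leftmost one, so that the front occupies cells H through H+L-1. The function names are hypothetical and not part of the algorithm.

```python
def free_cells_right(H: int, L: int, N: int) -> int:
    """Number of cells between the right end of the front and the output end.

    Assumes cells are indexed 0..N-1 by distance to the input cell and the
    front occupies cells H..H+L-1 (an indexing convention chosen here).
    """
    return N - (H + L)


def lemma_prevents_overflow(N: int) -> bool:
    """Check that H + 2L <= N always leaves at least L free cells on the right."""
    for L in range(N // 2 + 1):
        for H in range(N - 2 * L + 1):  # every H with H + 2L <= N
            if free_cells_right(H, L, N) < L:
                return False
    return True


print(lemma_prevents_overflow(16))  # True
print(free_cells_right(4, 6, 16))   # 6
```

With N = 16, a front of length 6 can start no further than cell 4, which leaves 6 free cells on the output side.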

It is easy to generalize the rules specified above, which may be useful for tuning the algorithms according to
the average distribution of requests. Let A be the number of cells in the systolic array, and let α be the ratio
(speed of head)/(speed of tail). If we wish to allow up to N convex hull vertices in the array, at any time, we must
have the relation αA ≤ A − N, hence α ≤ 1 − N/A, satisfied. On the other hand, if a (resp. b) is the delay, measured
in number of phases, imposed between consecutive insert (resp. inclusion) operations, the following relation
must hold:

1/a ≤ α((1/a) + (1/b)),

that is,

1/a ≤ (1 − N/A)((1/a) + (1/b)).
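
The first constraint can be evaluated numerically. The Python sketch below computes the largest head/tail speed ratio compatible with the relation between the ratio, the number of cells A and the vertex bound N, read here as ratio × A ≤ A − N. The function and parameter names are illustrative assumptions, and exact fractions are used only to avoid rounding. Note that N counts hull vertices here, whereas in Lemma 3 it denoted the number of cells.

```python
from fractions import Fraction


def max_speed_ratio(num_cells: int, max_vertices: int) -> Fraction:
    """Largest head/tail speed ratio allowed by ratio * A <= A - N.

    num_cells plays the role of A and max_vertices the role of N
    (hypothetical parameter names chosen for this sketch).
    """
    if num_cells <= 0 or not 0 <= max_vertices <= num_cells:
        raise ValueError("need num_cells > 0 and 0 <= max_vertices <= num_cells")
    return 1 - Fraction(max_vertices, num_cells)


def ratio_is_safe(ratio, num_cells: int, max_vertices: int) -> bool:
    """True if the given head/tail speed ratio satisfies the constraint."""
    return Fraction(ratio) <= max_speed_ratio(num_cells, max_vertices)


print(max_speed_ratio(16, 8))                 # 1/2
print(ratio_is_safe(Fraction(2, 3), 16, 8))   # False
```

With A = 16 cells and N = 8 vertices the bound is 1/2, the same ratio as in the proof of Lemma 3, where one cell is added to the front in the time two are detached.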


III. The Algorithm for INT2


We give only the algorithm for the insert operation, the others being handled in a way strictly similar to
CH2. When the line L delimiting the half-plane H to be inserted intersects the current polygon I, the
intersection consists (in general) of a segment VW, which must be added to the array. To do so, it suffices to
tag the first intersection point encountered, V or W, along with the half-plane H, as it travels left-to-right, in
order to insert VW into the array as soon as the other end-point can be computed (case 1 or 4). As usual, note
the presence of the register C to buffer out the delay caused by an insertion. See Fig. 15.
Figure 15: The various cases for INT2.